Abstract: In practice, many communication networks are huge in
scale, complicated in structure, or even dynamic, so predesigning
linear network codes based on the network topology is impossible
even if the topological structure is known. Therefore, random linear
network coding has been proposed as an acceptable coding technique
for the case that the network topology cannot be utilized completely.
Motivated by the fact that different network topological information
can be obtained for different practical applications, we analyze the
performance of random linear network coding by studying several
failure probabilities that depend on this topological information.
We obtain tight or asymptotically tight upper bounds on these failure
probabilities and indicate the worst cases for these bounds, i.e.,
the networks meeting the upper bounds with equality. In addition,
the more topological information of the network is utilized, the
better the upper bounds obtained. We also discuss lower bounds on
the failure probabilities.

I. Introduction

Network coding was proposed by Ahlswede et al. [1], who showed
that if coding is applied at the nodes instead of routing alone,
the source node can multicast the information to all sink nodes at
the theoretically maximum rate. Li et al. [2] further indicated that
linear network coding with finite alphabet size is sufficient for
multicast. Koetter and Medard [3] presented an algebraic
characterization of network coding. Network coding allows a higher
information rate than classical routing, and Jaggi et al. [4]
proposed a deterministic polynomial-time algorithm for constructing
a linear network code. Random linear network coding was introduced
by Ho et al. [5] as an acceptable coding technique for many
communication problems, particularly for the case that the network
topology cannot be utilized completely, because predesigned network
codes cannot be used there. Their main results are upper bounds on
different failure probabilities which characterize the performance
of random linear network coding. Balli et al. [6] improved on these
bounds and analyzed their asymptotic behavior as the field size goes
to infinity. However, these upper bounds are not tight. In order to
characterize the performance of random linear network coding more
comprehensively, Guang and Fu [7] introduced and studied the average
failure probability of random linear network coding. In this paper,
we further discuss random linear network coding and improve on the
bounds for different cases. In particular, the more knowledge about
the topology of the network is available, the better the bounds we
can obtain. Further, we indicate that these bounds are either tight
or asymptotically tight.

A communication network is represented by a finite acyclic
directed graph $G = (V, E)$, where $V$ and $E$ are the sets of
nodes and channels of the network, respectively. A directed edge
$e = (i, j) \in E$ stands for a channel leading from node $i$ to
node $j$. Node $i$ is called the tail of $e$ and node $j$ is called
the head of $e$, denoted by $tail(e)$ and $head(e)$, respectively.
Correspondingly, the channel $e$ is called an outgoing channel
of $i$ and an incoming channel of $j$. For a node $i$, define
$Out(i) = \{e \in E : tail(e) = i\}$ and
$In(i) = \{e \in E : head(e) = i\}$. We allow multiple channels
between two nodes and assume that one field symbol can be transmitted
over a channel in a unit time. In this paper, we only consider
networks with a single source; the unique source node is denoted
by $s$, which generates messages and transmits them to all sink
nodes $t \in T$ by network coding, where $T$ is the set of sink
nodes. Denote by $C_t$ the minimum cut capacity between the source
node $s$ and the sink node $t$. Let the information rate be $w$
symbols per unit time, which means that the source messages are $w$
symbols $X = (X_1, X_2, \ldots, X_w)$ arranged in a row vector,
where each $X_i$ is an element of the finite base field
$\mathcal{F}$. In this paper, we always assume that $w \le C_t$
for any $t \in T$. We use $U_e$ to denote the message transmitted
over channel $e = (i, j)$, and $U_e$ is calculated by the formula
$U_e = \sum_{d \in In(i)} k_{d,e} U_d$, where
$k_{d,e} \in \mathcal{F}$ is called the local encoding coefficient
for the adjacent pair of channels $(d, e)$. Further, it is not
difficult to see that $U_e$ is a linear combination of the $w$
source symbols $X_i$, $1 \le i \le w$; that is, there is a
$w$-dimensional column vector $f_e$ over the base field
$\mathcal{F}$ such that $U_e = X \cdot f_e$ (see [8], [9]). This
column vector $f_e$ is called the global encoding kernel of channel
$e$, and it can be determined by the local encoding coefficients.
Further, at the sink node $t \in T$, all global encoding kernels and
received messages of incoming channels are available. Define a
$w \times |In(t)|$ matrix $F_t$ and an $|In(t)|$-dimensional row
vector $A_t$ as $F_t = (f_e : e \in In(t))$ and
$A_t = (U_e : e \in In(t))$. Then we have the decoding equation
$A_t = X \cdot F_t$, which implies that the sink node $t$ can decode
(recover) the original source message vector $X$ successfully if and
only if $Rank(F_t) = w$.
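The decoding condition $Rank(F_t) = w$ can be checked mechanically. The following is a minimal sketch, assuming the base field is a prime field GF(p) (the paper allows any finite field); the function names `rank_mod_p` and `can_decode` are hypothetical helpers introduced only for illustration.

```python
def rank_mod_p(matrix, p):
    """Rank of an integer matrix over the prime field GF(p) (assumption: p prime)."""
    rows = [[x % p for x in row] for row in matrix]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse, Python 3.8+
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        # Eliminate this column from every other row.
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                c = rows[r][col]
                rows[r] = [(x - c * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def can_decode(F_t, w, p):
    """Sink t recovers X from A_t = X * F_t if and only if Rank(F_t) = w."""
    return rank_mod_p(F_t, p) == w
```

Here `F_t` is given as a list of $w$ rows, each with $|In(t)|$ entries, so the columns are the global encoding kernels of the incoming channels.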



The main idea of random linear network coding is that when a node
(possibly the source node $s$) receives the messages from all its
incoming channels, for each outgoing channel it picks the encoding
coefficients uniformly at random from the base field $\mathcal{F}$,
uses them to encode the received messages, and transmits the encoded
message over the outgoing channel. In other words, the local encoding
coefficients $k_{d,e}$ are independently and uniformly distributed
random variables taking values in the base field $\mathcal{F}$.
Since random linear network coding does not consider the global
network topology or coordinate coding at different nodes, it may not
achieve the best possible performance of network coding; that is,
some sink nodes may not decode correctly. Therefore, the performance
analysis is very important both in theory and in applications.
Before proceeding further, we introduce the definitions of the
failure probabilities used to characterize the performance of random
linear network coding.

Definition 1: For random linear network coding on $G$,

• $P_e = Pr(\exists\, t \in T \text{ such that } Rank(F_t) < w)$ is
called the failure probability of random linear network coding
for network $G$, that is, the probability that the messages
cannot be decoded correctly at at least one sink node in $T$.

• $P_{e_t} = Pr(Rank(F_t) < w)$ is called the failure probability
of random linear network coding at sink node $t$, that is, the
probability that the source messages cannot be decoded correctly
at the sink node $t \in T$.
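For very small fields, both failure probabilities in Definition 1 can be computed exactly by enumerating all local encoding coefficients. The sketch below does this for a butterfly network with $w = 2$ over a prime field GF(p). The channel layout ($e_1 = (s, i_1)$, $e_2 = (s, i_2)$, $e_3 = (i_1, t_1)$, $e_4 = (i_1, i_3)$, $e_5 = (i_2, i_3)$, $e_6 = (i_2, t_2)$, $e_7 = (i_3, i_4)$, $e_8 = (i_4, t_1)$, $e_9 = (i_4, t_2)$) is an assumption made for illustration, as is the helper name `butterfly_failure_probabilities`.

```python
from fractions import Fraction
from itertools import product


def butterfly_failure_probabilities(p):
    """Exact (P_e, P_{e,t1}) on the assumed butterfly layout over GF(p), p a small prime."""
    def det2(u, v):
        return (u[0] * v[1] - u[1] * v[0]) % p

    def scale(c, u):
        return ((c * u[0]) % p, (c * u[1]) % p)

    def add(u, v):
        return ((u[0] + v[0]) % p, (u[1] + v[1]) % p)

    total = fail_any = fail_t1 = 0
    # 12 local encoding coefficients, each uniform on GF(p).
    for k in product(range(p), repeat=12):
        f1 = (k[0], k[1])        # e1: combines the imaginary channels d1, d2
        f2 = (k[2], k[3])        # e2
        f3 = scale(k[4], f1)     # e3 = (i1, t1)
        f4 = scale(k[5], f1)     # e4 = (i1, i3)
        f5 = scale(k[6], f2)     # e5 = (i2, i3)
        f6 = scale(k[7], f2)     # e6 = (i2, t2)
        f7 = add(scale(k[8], f4), scale(k[9], f5))  # e7 = (i3, i4)
        f8 = scale(k[10], f7)    # e8 = (i4, t1)
        f9 = scale(k[11], f7)    # e9 = (i4, t2)
        ok_t1 = det2(f3, f8) != 0    # Rank(F_{t1}) = 2
        ok_t2 = det2(f6, f9) != 0    # Rank(F_{t2}) = 2
        total += 1
        fail_t1 += not ok_t1
        fail_any += not (ok_t1 and ok_t2)
    return Fraction(fail_any, total), Fraction(fail_t1, total)
```

The enumeration has $p^{12}$ terms, so it is only practical for $p = 2$ or $p = 3$; for larger fields a Monte Carlo estimate would be the natural substitute.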

II. Failure Probability for Networks 

In this section, we present our main results on the failure
probability of random linear network coding for network $G$, where
$G$ is any fixed network with a single source $s$. Let
$T = \{t_1, t_2, \ldots, t_l\}$ be the set of sink nodes. For each
sink node $t_i \in T$, by Menger's Theorem, there exist $w$ channel
disjoint paths from $s$ to $t_i$ since $w \le C_{t_i}$. Denote the
collection of the arbitrarily chosen $w$ paths for $t_i$ by
$\mathcal{P}_i = \{P_{i,1}, P_{i,2}, \ldots, P_{i,w}\}$, where the
path $P_{i,j} = \{e_{i,j,1}, e_{i,j,2}, \ldots, e_{i,j,m_{i,j}}\}$
satisfies $tail(e_{i,j,1}) = s$, $head(e_{i,j,m_{i,j}}) = t_i$, and
$tail(e_{i,j,k}) = head(e_{i,j,k-1})$ for the other channels.
Obviously, it is possible that
$\mathcal{P}_i \cap \mathcal{P}_j \ne \emptyset$ for distinct sink
nodes $t_i$ and $t_j$. Let $r_i$ be the number of internal nodes in
$\mathcal{P}_i$ and $R$ be the number of internal nodes in
$\bigcup_{t_i \in T} \mathcal{P}_i = \bigcup_{i=1}^{l} \mathcal{P}_i$.
Clearly, $\max_{1 \le i \le l} r_i \le R \le \sum_{i=1}^{l} r_i$.
Denote the $R$ internal nodes by $i_1, i_2, \ldots, i_R$ and let the
ancestral topological order be

$$s \triangleq i_0 \prec i_1 \prec i_2 \prec \cdots \prec i_R \prec \{t_1, t_2, \ldots, t_l\}.$$

During our discussion, we use the concept of cuts of the paths
from $s$ to $t$ introduced in [6] and [7], which is different from
the concept of cuts of networks in graph theory. For each $t_i$,
the first cut $CUT_{i,0}$ is the set of the $w$ imaginary channels,
i.e., $CUT_{i,0} = In(s)$. At $s$, the next cut $CUT_{i,1}$ is the
set of the first channels of all $w$ paths, i.e.,
$CUT_{i,1} = \{e_{i,1,1}, e_{i,2,1}, \ldots, e_{i,w,1}\}$. At node
$i_1$, the next cut $CUT_{i,2}$ is formed from $CUT_{i,1}$ as
follows: if $In(i_1) \cap CUT_{i,1} \ne \emptyset$, then replace the
channels in $In(i_1) \cap CUT_{i,1}$ by their respective next
channels in the paths, while the other channels remain the same as
in $CUT_{i,1}$; otherwise, if $In(i_1) \cap CUT_{i,1} = \emptyset$,
$CUT_{i,2}$ remains the same as $CUT_{i,1}$. In the same way, once
$CUT_{i,k}$ is defined, $CUT_{i,k+1}$ is formed from $CUT_{i,k}$ by
the same method. By induction, all cuts $CUT_{i,k}$ can be defined
for $i = 1, 2, \ldots, l$ and $k = 0, 1, \ldots, R+1$. In particular,
note that
$CUT_{i,R+1} = \{e_{i,1,m_{i,1}}, e_{i,2,m_{i,2}}, \ldots, e_{i,w,m_{i,w}}\}$,
that is, the set of the last channels of all $w$ paths from $s$
to $t_i$. Further, for each node $i_k$, $k = 0, 1, 2, \ldots, R$,
define $CUT^{out}_{i,k} = \{e : e \in CUT_{i,k} \setminus In(i_k)\}$,
and two sets $M_k = \{t_i : CUT_{i,k} \ne CUT_{i,k+1}\}$ and
$N_k = \{t_i : CUT_{i,k} \ne CUT_{i,k+1} \text{ and } CUT_{i,k+1} = CUT_{i,R+1}\}$.
In fact, $M_k$ is the set of sink nodes for which at least one of
the $w$ paths passes through the node $i_k$, and $N_k$ is the set of
sink nodes for which at least one of the $w$ paths passes through
the node $i_k$ and $i_k$ is the last internal node on the $w$ paths.
Furthermore, let $|M_k| = m_k$ and $|N_k| = n_k$. Then
$\sum_{k=0}^{R} n_k = l$ and
$\sum_{k=0}^{R} m_k = (r_1 + 1) + (r_2 + 1) + \cdots + (r_l + 1) = \sum_{i=1}^{l} r_i + l$,
and thus $\sum_{k=0}^{R} (m_k - n_k) = \sum_{i=1}^{l} r_i$.

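In order to illustrate the concepts introduced, we take the
butterfly network $G_1$ (Fig. 1) as an example.

Fig. 1. Butterfly Network $G_1$

For sink nodes $t_1, t_2 \in T$, let
$\mathcal{P}_1 = \{P_{1,1}, P_{1,2}\}$ and
$\mathcal{P}_2 = \{P_{2,1}, P_{2,2}\}$, where
$P_{1,1} = \{e_1, e_3\}$, $P_{1,2} = \{e_2, e_5, e_7, e_8\}$,
$P_{2,1} = \{e_1, e_4, e_7, e_9\}$, and $P_{2,2} = \{e_2, e_6\}$.
Then

$$
\begin{array}{ll}
CUT_{1,0} = \{d_1, d_2\}, & CUT^{out}_{1,0} = \emptyset, \\
CUT_{1,1} = \{e_1, e_2\}, & CUT^{out}_{1,1} = \{e_2\}, \\
CUT_{1,2} = \{e_3, e_2\}, & CUT^{out}_{1,2} = \{e_3\}, \\
CUT_{1,3} = \{e_3, e_5\}, & CUT^{out}_{1,3} = \{e_3\}, \\
CUT_{1,4} = \{e_3, e_7\}, & CUT^{out}_{1,4} = \{e_3\}, \\
CUT_{1,5} = \{e_3, e_8\}, & CUT^{out}_{1,5} = \emptyset; \\
CUT_{2,0} = \{d_1, d_2\}, & CUT^{out}_{2,0} = \emptyset, \\
CUT_{2,1} = \{e_1, e_2\}, & CUT^{out}_{2,1} = \{e_2\}, \\
CUT_{2,2} = \{e_4, e_2\}, & CUT^{out}_{2,2} = \{e_4\}, \\
CUT_{2,3} = \{e_4, e_6\}, & CUT^{out}_{2,3} = \{e_6\}, \\
CUT_{2,4} = \{e_7, e_6\}, & CUT^{out}_{2,4} = \{e_6\}, \\
CUT_{2,5} = \{e_9, e_6\}, & CUT^{out}_{2,5} = \emptyset;
\end{array}
$$

$$M_0 = M_1 = M_2 = M_3 = M_4 = \{t_1, t_2\},$$

$$N_0 = N_1 = N_2 = N_3 = \emptyset, \quad N_4 = \{t_1, t_2\}.$$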
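The cut construction above is purely mechanical, so it can be sketched in a few lines. The code below is an illustration under assumptions: the helper names `path_cuts` and `m_and_n` are hypothetical, and the butterfly channel layout (in particular $In(i_3) = \{e_4, e_5\}$ and $e_6$ leading from $i_2$ to $t_2$) is assumed rather than taken from Fig. 1.

```python
def path_cuts(paths, in_channels, node_order, imaginary):
    """Return [CUT_0, ..., CUT_{R+1}] for one sink.

    paths: the w chosen paths, each a list of channel names from s to the sink.
    in_channels: dict node -> set of incoming channels (In(s) holds the imaginary ones).
    node_order: [s, i_1, ..., i_R] in ancestral order.
    imaginary: the w imaginary channels entering s, one per path.
    """
    # Prepend the imaginary channel so that "next channel on the path" is uniform.
    full = [[d] + list(p) for d, p in zip(imaginary, paths)]
    pos = [0] * len(full)
    cuts = [[f[0] for f in full]]
    for node in node_order:
        for j, f in enumerate(full):
            if f[pos[j]] in in_channels[node]:
                pos[j] += 1  # replace the channel by its successor on the path
        cuts.append([f[pos[j]] for j, f in enumerate(full)])
    return cuts


def m_and_n(all_cuts):
    """Compute the sets M_k and N_k (k = 0..R) from the cuts of every sink."""
    R = len(next(iter(all_cuts.values()))) - 2
    M, N = [], []
    for k in range(R + 1):
        Mk = {t for t, c in all_cuts.items() if c[k] != c[k + 1]}
        Nk = {t for t in Mk if all_cuts[t][k + 1] == all_cuts[t][R + 1]}
        M.append(Mk)
        N.append(Nk)
    return M, N


# Assumed butterfly layout used only for this example.
in_channels = {"s": {"d1", "d2"}, "i1": {"e1"}, "i2": {"e2"},
               "i3": {"e4", "e5"}, "i4": {"e7"}}
order = ["s", "i1", "i2", "i3", "i4"]
paths = {"t1": [["e1", "e3"], ["e2", "e5", "e7", "e8"]],
         "t2": [["e1", "e4", "e7", "e9"], ["e2", "e6"]]}
cuts = {t: path_cuts(p, in_channels, order, ["d1", "d2"]) for t, p in paths.items()}
M, N = m_and_n(cuts)
```

Cuts are kept as lists indexed by path, so `cuts["t1"][k][j]` is the channel of path $P_{1,j+1}$ that belongs to $CUT_{1,k}$.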



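Theorem 1: The failure probability of random linear network
coding for the network $G$ satisfies

$$P_e \le 1 - (1 - a)^{l} \prod_{k=0}^{R-1} \left[ 1 - (m_k - n_k) a \right],$$

where $a \triangleq 1 - \prod_{i=1}^{w} \left( 1 - \frac{1}{|\mathcal{F}|^{i}} \right)$.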
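The bound of Theorem 1 is straightforward to evaluate once the counts $m_k$ and $n_k$ are known. The following sketch uses exact rational arithmetic; the function names `a_value` and `theorem1_bound` are hypothetical, and the field size `q` is passed as a plain integer.

```python
from fractions import Fraction


def a_value(w, q):
    """a = 1 - prod_{i=1}^{w} (1 - 1/q^i), with q = |F|."""
    prod = Fraction(1)
    for i in range(1, w + 1):
        prod *= 1 - Fraction(1, q ** i)
    return 1 - prod


def theorem1_bound(w, q, m, n):
    """Upper bound of Theorem 1; m and n are the lists m_0..m_R and n_0..n_R."""
    a = a_value(w, q)
    l = sum(n)  # sum_k n_k = l, the number of sinks
    success = (1 - a) ** l
    for mk, nk in zip(m[:-1], n[:-1]):  # product over k = 0..R-1
        success *= 1 - (mk - nk) * a
    return 1 - success
```

For the butterfly example with $w = 2$, one would call `theorem1_bound(2, q, [2, 2, 2, 2, 2], [0, 0, 0, 0, 2])`. Note that the factors $1 - (m_k - n_k)a$ become negative for very small fields such as $|\mathcal{F}| = 2$, so the bound is mainly informative when the field is reasonably large.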

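Before giving the proof, we need the following lemma.

Lemma 2: Let $\mathcal{L}$ be an $n$-dimensional linear space over
a finite field $\mathcal{F}_q$, and let $\mathcal{L}_0, \mathcal{L}_1$
be two subspaces of $\mathcal{L}$ of dimensions $k_0, k_1$,
respectively, with
$\langle \mathcal{L}_0 \cup \mathcal{L}_1 \rangle = \mathcal{L}$.
Let $l_1, l_2, \ldots, l_{n-k_0}$ be $(n - k_0)$ independently and
uniformly distributed random vectors taking values in
$\mathcal{L}_1$. Then

$$Pr\big(\dim(\langle \mathcal{L}_0 \cup \{l_1, \ldots, l_{n-k_0}\} \rangle) = n\big) = \prod_{i=1}^{n-k_0} \left( 1 - \frac{1}{q^{i}} \right).$$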
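Lemma 2 can be sanity-checked by brute force on tiny instances. The sketch below enumerates every choice of the random vectors in $\mathcal{L}_1$ and counts how often the span reaches dimension $n$. It assumes a prime field GF(p) and that `basis0` and `basis1` are given as lists of tuples spanning $\mathcal{L}_0$ and $\mathcal{L}_1$; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def rank_mod_p(vectors, p):
    """Rank over GF(p) of a list of equal-length vectors (p prime)."""
    rows = [[x % p for x in v] for v in vectors]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                c = rows[r][col]
                rows[r] = [(x - c * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def span(basis, p):
    """All distinct vectors of the subspace spanned by `basis` over GF(p)."""
    n = len(basis[0])
    vecs = set()
    for coeffs in product(range(p), repeat=len(basis)):
        vecs.add(tuple(sum(c * b[i] for c, b in zip(coeffs, basis)) % p
                       for i in range(n)))
    return sorted(vecs)


def lemma2_probability(basis0, basis1, p):
    """Exact Pr(dim<L0 u {l_1..l_{n-k0}}> = n) with l_i uniform on L1."""
    n = len(basis1[0])
    draws = n - len(basis0)
    l1_vectors = span(basis1, p)
    good = total = 0
    for choice in product(l1_vectors, repeat=draws):
        total += 1
        good += rank_mod_p(list(basis0) + list(choice), p) == n
    return Fraction(good, total)
```

For example, with $\mathcal{L}_0 = \langle (1,0,0) \rangle$ and $\mathcal{L}_1 = \langle (1,1,0), (0,0,1) \rangle$ over GF(2), the lemma predicts $(1 - 1/2)(1 - 1/4) = 3/8$.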

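Proof of Theorem 1: For each sink node $t_i \in T$, recall that the
matrix $F_{t_i} = (f_e : e \in In(t_i))$ of size
$w \times |In(t_i)|$ is the decoding matrix of $t_i$, which is
further denoted by $F_i$ for convenience. Further, define the
$w \times w$ matrix
$F_i' = (f_{e_{i,1,m_{i,1}}}, f_{e_{i,2,m_{i,2}}}, \ldots, f_{e_{i,w,m_{i,w}}})$,
where we recall that $e_{i,j,m_{i,j}}$, $1 \le j \le w$, are the
last channels of the chosen $w$ channel disjoint paths from $s$ to
$t_i$. It is readily seen that $F_i'$ is a submatrix of $F_i$. So
the event "$Rank(F_i) < w$" implies the event "$Rank(F_i') < w$".
Hence

$$P_e = Pr\Big(\bigcup_{i=1}^{l} \{Rank(F_i) < w\}\Big) \le Pr\Big(\bigcup_{i=1}^{l} \{Rank(F_i') < w\}\Big).$$

In addition, let $F_i^{(k)} = (f_e : e \in CUT_{i,k})$ be
$w \times w$ matrices for $i = 1, 2, \ldots, l$ and
$k = 0, 1, \ldots, R+1$. If $Rank(F_i^{(k)}) < w$, we say that we
have a failure at $CUT_{i,k}$, and the event
"$Rank(F_i^{(k)}) = w$" is denoted by $\Gamma_{i,k}$. Note that
$F_i' = F_i^{(R+1)}$ since
$CUT_{i,R+1} = \{e_{i,1,m_{i,1}}, e_{i,2,m_{i,2}}, \ldots, e_{i,w,m_{i,w}}\}$.
Then it further follows that

$$P_e \le 1 - Pr\Big(\bigcap_{i=1}^{l} \{Rank(F_i') = w\}\Big) = 1 - Pr\Big(\bigcap_{i=1}^{l} \Gamma_{i,R+1}\Big).$$

Next, we consider the probability
$Pr(\bigcap_{i=1}^{l} \Gamma_{i,R+1})$. First,

$$
\begin{aligned}
Pr\Big(\bigcap_{i=1}^{l} \Gamma_{i,R+1}\Big)
&\ge Pr\Big(\bigcap_{i=1}^{l} \Gamma_{i,0}\Big) \cdot \prod_{k=0}^{R} Pr\Big(\bigcap_{i=1}^{l} \Gamma_{i,k+1} \,\Big|\, \bigcap_{i=1}^{l} \Gamma_{i,k}\Big) && (1) \\
&= \prod_{k=0}^{R} Pr\Big(\bigcap_{i=1}^{l} \Gamma_{i,k+1} \,\Big|\, \bigcap_{i=1}^{l} \Gamma_{i,k}\Big), && (2)
\end{aligned}
$$

where (1) follows because encoding at any node is independent of
what happened before this node as long as no failure has occurred
up to this node, and (2) follows from
$Pr(\bigcap_{i=1}^{l} \Gamma_{i,0}) = Pr(Rank(I_{w \times w}) = w) = 1$.
Subsequently, we consider the probability
$Pr(\bigcap_{i=1}^{l} \Gamma_{i,k+1} \mid \bigcap_{i=1}^{l} \Gamma_{i,k})$.
Actually,

$$
\begin{aligned}
Pr\Big(\bigcap_{i=1}^{l} \Gamma_{i,k+1} \,\Big|\, \bigcap_{i=1}^{l} \Gamma_{i,k}\Big)
&= Pr\Big(\bigcap_{t_i \in M_k} \Gamma_{i,k+1} \,\Big|\, \bigcap_{i=1}^{l} \Gamma_{i,k}\Big) \\
&= Pr\Big(\bigcap_{t_j \in M_k - N_k} \Gamma_{j,k+1} \,\Big|\, \bigcap_{i=1}^{l} \Gamma_{i,k}\Big) \cdot \prod_{t_i \in N_k} Pr\Big(\Gamma_{i,k+1} \,\Big|\, \bigcap_{i=1}^{l} \Gamma_{i,k}\Big) && (3) \\
&= Pr\Big(\bigcap_{t_j \in M_k - N_k} \Gamma_{j,k+1} \,\Big|\, \bigcap_{i=1}^{l} \Gamma_{i,k}\Big) \cdot \prod_{t_i \in N_k} Pr(\Gamma_{i,k+1} \mid \Gamma_{i,k}),
\end{aligned}
$$

where (3) follows because for $t_i \in N_k$, the event
$\Gamma_{i,k+1}$ is conditionally independent of
$\bigcap_{t_j \in M_k - \{t_i\}} \Gamma_{j,k+1}$ under the condition
$\bigcap_{i=1}^{l} \Gamma_{i,k}$. By convention, put
$\prod_{t_i \in N_k} Pr(\Gamma_{i,k+1} \mid \Gamma_{i,k}) = 1$ for
$N_k = \emptyset$, and put
$Pr(\bigcap_{t_j \in M_k - N_k} \Gamma_{j,k+1} \mid \bigcap_{i=1}^{l} \Gamma_{i,k}) = 1$
for $M_k - N_k = \emptyset$. Further applying Lemma 2, one has

$$
\begin{aligned}
Pr\Big(\bigcap_{t_j \in M_k - N_k} \Gamma_{j,k+1} \,\Big|\, \bigcap_{i=1}^{l} \Gamma_{i,k}\Big)
&= 1 - Pr\Big(\bigcup_{t_j \in M_k - N_k} \Gamma^{c}_{j,k+1} \,\Big|\, \bigcap_{i=1}^{l} \Gamma_{i,k}\Big) \\
&\ge 1 - \sum_{t_j \in M_k - N_k} \Big[ 1 - Pr\Big(\Gamma_{j,k+1} \,\Big|\, \bigcap_{i=1}^{l} \Gamma_{i,k}\Big) \Big] \\
&= 1 - \sum_{t_j \in M_k - N_k} \big[ 1 - Pr(\Gamma_{j,k+1} \mid \Gamma_{j,k}) \big] \\
&= 1 - \sum_{t_j \in M_k - N_k} \Big[ 1 - \prod_{h=1}^{w - |CUT^{out}_{j,k}|} \Big( 1 - \frac{1}{|\mathcal{F}|^{h}} \Big) \Big] \\
&\ge 1 - (m_k - n_k) \Big[ 1 - \prod_{h=1}^{w} \Big( 1 - \frac{1}{|\mathcal{F}|^{h}} \Big) \Big] \\
&= 1 - (m_k - n_k) a,
\end{aligned}
$$

where the last inequality holds because every factor
$1 - 1/|\mathcal{F}|^{h}$ is at most 1, so extending the product to
$h = 1, \ldots, w$ can only decrease it. On the other hand,

$$
\prod_{t_i \in N_k} Pr(\Gamma_{i,k+1} \mid \Gamma_{i,k})
= \prod_{t_i \in N_k} \prod_{h=1}^{w - |CUT^{out}_{i,k}|} \Big( 1 - \frac{1}{|\mathcal{F}|^{h}} \Big)
\ge \Big[ \prod_{h=1}^{w} \Big( 1 - \frac{1}{|\mathcal{F}|^{h}} \Big) \Big]^{n_k}
= (1 - a)^{n_k}.
$$

Combining the above, it follows that

$$Pr\Big(\bigcap_{i=1}^{l} \Gamma_{i,k+1} \,\Big|\, \bigcap_{i=1}^{l} \Gamma_{i,k}\Big) \ge (1 - a)^{n_k} \big( 1 - (m_k - n_k) a \big),$$

which implies that

$$
\begin{aligned}
Pr\Big(\bigcap_{i=1}^{l} \Gamma_{i,R+1}\Big)
&\ge \prod_{k=0}^{R} Pr\Big(\bigcap_{i=1}^{l} \Gamma_{i,k+1} \,\Big|\, \bigcap_{i=1}^{l} \Gamma_{i,k}\Big) \\
&\ge \prod_{k=0}^{R} (1 - a)^{n_k} \big( 1 - (m_k - n_k) a \big) \\
&= (1 - a)^{\sum_{k=0}^{R} n_k} \prod_{k=0}^{R} \big( 1 - (m_k - n_k) a \big) \\
&= (1 - a)^{l} \prod_{k=0}^{R-1} \big( 1 - (m_k - n_k) a \big), && (4)
\end{aligned}
$$

where the last equality (4) follows from $\sum_{k=0}^{R} n_k = l$
and $m_R = n_R$. So the proof is completed. ∎
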
Remark 3: This upper bound on the failure probability for the
network is achievable. We give a specific network below to show
the tightness. For a given information rate $w$, the network $G_2$
is constructed as follows. Let the unique source node be $s$ and
the sink nodes be $t_1, t_2, \ldots, t_l$. Construct a plait
network $G'_1$ (see Fig. 2) with $R$ internal nodes for the sink
node $t_1$, and plait networks $G'_j$ without internal nodes for
the other sink nodes $t_j$ ($j = 2, 3, \ldots, l$). These $l$ plait
networks share a common source node $s$, i.e., the network $G_2$ is
the union of the $l$ plait networks $G'_j$ ($j = 1, 2, 3, \ldots, l$).

Fig. 2. Plait Network with $r$ internal nodes (consecutive nodes are joined by $w$ channels)

After a simple calculation, it is not difficult to obtain

$$P_e(G_2) = 1 - \left[ \prod_{i=1}^{w} \left( 1 - \frac{1}{|\mathcal{F}|^{i}} \right) \right]^{R+l} = 1 - (1 - a)^{R+l},$$

which meets the upper bound in Theorem 1 with equality.
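As a check of this tightness claim, the counts $m_k$ and $n_k$ of Theorem 1 can be worked out for $G_2$. The sketch below assumes $R \ge 1$ and that every stage of a plait network consists of $w$ parallel channels between consecutive nodes, as Fig. 2 suggests; it is an illustration, not the authors' own calculation.

```latex
% Sink t_1 uses all R internal nodes; sinks t_2, ..., t_l are fed directly by s.
m_0 = l,\quad n_0 = l - 1; \qquad
m_k = 1,\quad n_k = 0 \quad (1 \le k \le R-1); \qquad
m_R = n_R = 1.
% Hence m_k - n_k = 1 for k = 0, ..., R-1, and Theorem 1 gives
1 - (1-a)^{l} \prod_{k=0}^{R-1} \bigl[ 1 - (m_k - n_k)\,a \bigr]
  = 1 - (1-a)^{l} (1-a)^{R}
  = 1 - (1-a)^{R+l}.
```

Under these assumptions the bound of Theorem 1 coincides with the stated value of $P_e(G_2)$.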

However, this upper bound may require too much topological
information about the network for many applications. Thus, we give
an upper bound that is simpler in form but looser, depending on
less topological information.

Theorem 4: The failure probability of random linear network
coding for the network $G$ satisfies

$$P_e \le 1 - (1 - a)^{l} (1 - l a)^{b} (1 - u a),$$

where $\sum_{i=1}^{l} r_i = l b + u$ with $b, u$ being two
nonnegative integers satisfying $0 \le u \le l - 1$, and $r_i$ again
being the number of internal nodes in the chosen $w$ channel
disjoint paths from $s$ to $t_i$, $i = 1, 2, \ldots, l$.
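Theorem 4 only needs the per-sink counts $r_i$, and the decomposition $\sum_i r_i = lb + u$ is a division with remainder. A minimal sketch with exact rational arithmetic follows; the names `a_value` and `theorem4_bound` are hypothetical.

```python
from fractions import Fraction


def a_value(w, q):
    """a = 1 - prod_{i=1}^{w} (1 - 1/q^i), with q = |F|."""
    prod = Fraction(1)
    for i in range(1, w + 1):
        prod *= 1 - Fraction(1, q ** i)
    return 1 - prod


def theorem4_bound(w, q, r):
    """Upper bound of Theorem 4; r is the list (r_1, ..., r_l)."""
    l = len(r)
    b, u = divmod(sum(r), l)  # sum r_i = l*b + u with 0 <= u <= l-1
    a = a_value(w, q)
    return 1 - (1 - a) ** l * (1 - l * a) ** b * (1 - u * a)
```

For the butterfly example, where both sinks have $r_i = 4$, this gives the same expression as Theorem 1, since there every $m_k - n_k$ with $k \le R - 1$ equals $l$.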

Sometimes we cannot acquire the exact value of the sum of these
$r_i$, but we can usually still obtain some topological information
about the network. For example, although we do not know the sum of
the $r_i$, an upper bound $n$ may be found, that is, an integer $n$
satisfying $n \ge \sum_{i=1}^{l} r_i$. Let $n = l\hat{b} + \hat{u}$,
where $\hat{b}, \hat{u}$ are two nonnegative integers satisfying
$0 \le \hat{u} \le l - 1$. Since $l\hat{b} + \hat{u} \ge lb + u$,
after a simple calculation, one has
$(1 - la)^{\hat{b}} (1 - \hat{u}a) \le (1 - la)^{b} (1 - ua)$.

Theorem 5: For the network $G$, let $r_i$ be the number of
internal nodes in $w$ channel disjoint paths from $s$ to $t_i$. If
$\sum_{i=1}^{l} r_i \le n$, then the failure probability of random
linear network coding for the network $G$ satisfies

$$P_e \le 1 - (1 - a)^{l} (1 - l a)^{\hat{b}} (1 - \hat{u} a),$$

where $n = l\hat{b} + \hat{u}$ with $\hat{b}, \hat{u}$ being two
nonnegative integers satisfying $0 \le \hat{u} \le l - 1$.



In particular, for each sink node $t_i \in T$, if we can choose
those $w$ channel disjoint paths which contain the minimum number
of internal nodes among all collections of $w$ channel disjoint
paths from $s$ to $t_i$, and denote this minimum number by $R_i$,
then we obtain an upper bound that is smaller than that in
Theorem 4 and has the same simple form.

Corollary 6: The failure probability of random linear network
coding for the network $G$ satisfies

$$P_e \le 1 - (1 - a)^{l} (1 - l a)^{b'} (1 - u' a),$$

where similarly $\sum_{i=1}^{l} R_i = l b' + u'$ with $b', u'$ being
two nonnegative integers satisfying $0 \le u' \le l - 1$.

Remark 7: Unfortunately, we cannot show the tightness of the upper
bounds in Theorems 4 and 5 and Corollary 6. Actually, we conjecture
that these upper bounds are not tight. However, motivated partly
by [6], we study the asymptotic behavior of the failure
probabilities as the field size goes to infinity, because some
complicated minor terms may be ignored during the derivation. This
gives a deeper understanding of the failure probability and
identifies the main factors influencing it. In fact, these upper
bounds are asymptotically tight.

Furthermore, it is apparent that the number $R$ does not exceed
the number of internal nodes $|J|$. Hence, we obtain the following
theorem.

Theorem 8: The failure probability of random linear network
coding for the network $G$ satisfies, for $m \ge |J|$,

$$P_e \le 1 - (1 - a)^{l} (1 - l a)^{m}.$$

In particular, if the number of internal nodes is known,

$$P_e \le 1 - (1 - a)^{|T|} (1 - l a)^{|J|}, \quad \text{where } l = |T|.$$

Remark 9: The upper bound stated in Theorem 8 is also
asymptotically tight.

Next, we consider a linear network coding problem $N^*$, which is
fully characterized by the network $G$, the source node $s$, the
set $T$ of sink nodes, and the information rate
$w \le \min_{t \in T} C_t$. Thus it can be written as
$N^* = (G = (V, E), s, T, w \le \min_{t \in T} C_t)$. Define

$$\eta(N^*) \triangleq \limsup_{|\mathcal{F}| \to \infty} |\mathcal{F}| \cdot P_e,$$

which characterizes the limiting behavior of the failure
probability for the network as the field size goes to infinity.

Denote by $\mathcal{N}^*_{n,l}$ the set of all linear network
coding problems $N^*$ satisfying the following conditions:

1) the number of sink nodes is $l$,

2) for every sink node $t_i$, $1 \le i \le l$, there exist $w$
channel disjoint paths from $s$ to $t_i$ with $r_i$ internal nodes,
such that the sum of all $r_i$ does not exceed a fixed number $n$.

Define

$$\eta^{(n,l)} \triangleq \max_{N^* \in \mathcal{N}^*_{n,l}} \eta(N^*),$$

which characterizes the worst-case limiting behavior of the
failure probability for the networks in $\mathcal{N}^*_{n,l}$.



Moreover, denote by $\mathcal{M}^*_{m,l}$ the set of all linear
network coding problems $N^*$ with a fixed number of internal nodes
$|J| = m$ and a fixed number of sink nodes $|T| = l$. Define

$$\eta_{(m,l)} \triangleq \max_{N^* \in \mathcal{M}^*_{m,l}} \eta(N^*),$$

which characterizes the worst-case limiting behavior of the
failure probability for the networks in $\mathcal{M}^*_{m,l}$.

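From Theorem 5 and Remark 7, as well as Theorem 8 and Remark 9,
we derive the following theorem.

Theorem 10: For single source multicast random linear network
coding, we have

$$\eta^{(n,l)} = |T| + n \quad \text{and} \quad \eta_{(m,l)} = |T| (1 + |J|),$$

where $\eta^{(n,l)}$ and $\eta_{(m,l)}$ denote the worst-case limits
over the two classes of problems defined above (at most $n$ path
internal nodes in total, and $|J| = m$ internal nodes, respectively,
each with $|T| = l$ sinks).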
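The constants in Theorem 10 can be recovered numerically from the upper bounds of Theorems 5 and 8: multiplying each bound by the field size and letting the field grow should approach $l + n$ and $l(1 + m)$. The sketch below is an illustration with hypothetical helper names; it only evaluates the upper bounds and does not by itself establish that they are attained.

```python
from fractions import Fraction


def a_value(w, q):
    """a = 1 - prod_{i=1}^{w} (1 - 1/q^i), with q = |F|."""
    prod = Fraction(1)
    for i in range(1, w + 1):
        prod *= 1 - Fraction(1, q ** i)
    return 1 - prod


def theorem5_bound(w, q, l, n):
    """1 - (1-a)^l (1-la)^b (1-ua) with n = l*b + u, 0 <= u <= l-1."""
    a = a_value(w, q)
    b, u = divmod(n, l)
    return 1 - (1 - a) ** l * (1 - l * a) ** b * (1 - u * a)


def theorem8_bound(w, q, l, m):
    """1 - (1-a)^l (1-la)^m for m >= |J|."""
    a = a_value(w, q)
    return 1 - (1 - a) ** l * (1 - l * a) ** m


def scaled(bound, q):
    """|F| times the bound, as a float for easy inspection."""
    return float(q * bound)
```

With `q = 2**20`, `w = 2` and `l = 3`, `scaled(theorem5_bound(2, q, 3, 7), q)` is close to $3 + 7 = 10$ and `scaled(theorem8_bound(2, q, 3, 4), q)` is close to $3 \cdot (1 + 4) = 15$. Exact fractions are used before the final conversion so that the subtraction from 1 does not lose precision.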

III. Failure Probability at Sink Node 



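In this section, we further give results on the failure
probability at a sink node, part of which appeared in [10].

Theorem 11: For the network $G$ mentioned above, the failure
probability of random linear network coding at sink node $t \in T$
satisfies

$$P_{e_t} \le 1 - \prod_{k=0}^{r} \prod_{i=1}^{w - |CUT^{out}_{t,k}|} \left( 1 - \frac{1}{|\mathcal{F}|^{i}} \right),$$

where $r$ is the number of internal nodes on the chosen $w$ channel
disjoint paths from $s$ to $t$, and the cuts $CUT^{out}_{t,k}$ are
defined as in Section II for these paths.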
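The bound of Theorem 11 depends on the topology only through the sizes $|CUT^{out}_{t,k}|$, so it can be evaluated from a short list of integers. The following sketch is an illustration; `theorem11_bound` is a hypothetical name, and the sizes used in the example are those of sink $t_1$ in the butterfly example of Section II ($0, 1, 1, 1, 1$ for $k = 0, \ldots, 4$).

```python
from fractions import Fraction


def theorem11_bound(w, q, out_sizes):
    """1 - prod_k prod_{i=1}^{w - |CUT^out_{t,k}|} (1 - 1/q^i).

    out_sizes: the list |CUT^out_{t,0}|, ..., |CUT^out_{t,r}|; q = |F|.
    """
    success = Fraction(1)
    for size in out_sizes:
        for i in range(1, w - size + 1):
            success *= 1 - Fraction(1, q ** i)
    return 1 - success
```

For the butterfly sink $t_1$ with $w = 2$ over GF(2), `theorem11_bound(2, 2, [0, 1, 1, 1, 1])` evaluates to $125/128$. A direct count over all coefficient choices on a butterfly with the standard channel layout gives the same failure probability at $t_1$, which is consistent with the tightness statement for this network.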

This upper bound is tight for some networks such as the well-known
butterfly network [10]. However, the upper bound may be too
complicated for applications, and it may require too much
topological information about the network. So we give an upper
bound that is simpler in form but looser, as follows.

Theorem 12: For the network $G$, the failure probability of
random linear network coding at sink node $t \in T$ satisfies

$$P_{e_t} \le 1 - \left[ \prod_{i=1}^{w} \left( 1 - \frac{1}{|\mathcal{F}|^{i}} \right) \right]^{r+1},$$

where $r$ is the number of internal nodes for some collection
of $w$ channel disjoint paths from $s$ to $t$.

In some applications, the number $r$ of internal nodes is unknown,
but an upper bound $n$ on it is available, i.e., $n \ge r$. In this
case, we can also analyze the failure probability at the sink
node $t$.

Theorem 13: For the network $G$, if $r \le n$, then the failure
probability of random linear network coding at the sink node
$t \in T$ satisfies

$$P_{e_t} \le 1 - \left[ \prod_{i=1}^{w} \left( 1 - \frac{1}{|\mathcal{F}|^{i}} \right) \right]^{n+1}.$$

In particular,

$$P_{e_t} \le 1 - \left[ \prod_{i=1}^{w} \left( 1 - \frac{1}{|\mathcal{F}|^{i}} \right) \right]^{|J|+1}.$$
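The simpler sink-node bounds depend only on $w$, the field size, and a count of internal nodes. The sketch below evaluates $1 - (1 - a)^{r+1}$ exactly; the name `sink_bound` is hypothetical, and the same function serves Theorem 12 (pass $r$), Theorem 13 (pass the upper bound $n$) and its special case (pass $|J|$).

```python
from fractions import Fraction


def sink_bound(w, q, nodes):
    """1 - [prod_{i=1}^{w} (1 - 1/q^i)]^(nodes + 1), with q = |F|."""
    prod = Fraction(1)
    for i in range(1, w + 1):
        prod *= 1 - Fraction(1, q ** i)
    return 1 - prod ** (nodes + 1)
```

For a sink with $r = 4$, $w = 2$ and $|\mathcal{F}| = 2$ this gives $1 - (3/8)^5 = 32525/32768$, which is larger than the value $125/128$ that the finer bound of Theorem 11 yields when the sizes $|CUT^{out}_{t,k}|$ are $0, 1, 1, 1, 1$. Replacing $r$ by any larger count can only increase the bound.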



The upper bounds in Theorems 12 and 13 are also achievable, with
the plait networks as the worst case. Further, consider a linear
network coding problem $N^*$ and define

$$P^{*}_{(m,l)} \triangleq \max_{N^* \in \mathcal{M}^*_{m,l},\; t \in T} P_{e_t},$$

which characterizes the maximum value of the failure probability
of random network coding at a sink node among all linear network
coding problems $N^*$ with $|T| = l$ and $|J| = m$.

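Theorem 14: For linear network coding problems in
$\mathcal{M}^*_{m,l}$,

$$P^{*}_{(m,l)} = 1 - \left[ \prod_{i=1}^{w} \left( 1 - \frac{1}{|\mathcal{F}|^{i}} \right) \right]^{m+1}.$$
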

IV. Lower Bounds on the Failure Probabilities

In addition, we can also give lower bounds on the failure
probabilities.

Theorem 15: For random linear network coding on a single source
multicast network $G$,

• the failure probability at the sink node $t$ satisfies
$P_{e_t} \ge 1 / |\mathcal{F}|^{\delta_t + 1}$;

• the failure probability for the network $G$ satisfies
$P_e \ge 1 / |\mathcal{F}|^{\delta + 1}$,

where $\delta = \min_{t \in T} \delta_t$ with $\delta_t = C_t - w$.
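The lower bounds depend only on the min-cut capacities, the rate $w$, and the field size. The sketch below evaluates them in the form $1/|\mathcal{F}|^{\delta_t + 1}$ and $1/|\mathcal{F}|^{\delta + 1}$ with $\delta_t = C_t - w$; this reading of Theorem 15, the helper name `lower_bounds`, and the dictionary input are assumptions made for illustration.

```python
from fractions import Fraction


def lower_bounds(min_cuts, w, q):
    """Per-sink and network lower bounds on the failure probability.

    min_cuts: dict sink -> C_t (min-cut capacity from s), with w <= C_t; q = |F|.
    """
    # delta_t = C_t - w for every sink t.
    per_sink = {t: Fraction(1, q ** (c - w + 1)) for t, c in min_cuts.items()}
    delta = min(c - w for c in min_cuts.values())
    return per_sink, Fraction(1, q ** (delta + 1))
```

The network bound is governed by the sink with the smallest slack $C_t - w$, so extra capacity at the other sinks does not improve it.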

Remark 16: Both lower bounds above are also asymptotically
achievable. Moreover, together with the upper bounds, the lower
bounds on the failure probabilities recover the conclusion
proposed in [5], that is, both failure probabilities tend to zero
as the size of the base field goes to infinity.